Abstract

We consider a general class of superadditive scores measuring the similarity of two independent sequences of $n$ i.i.d. letters from a finite alphabet. Our object of interest is the mean score per letter $l_n$. By superadditivity, $l_n$ is nondecreasing and converges to a limit $l$. We give a simple method of bounding the difference $l - l_n$ and obtaining the rate of convergence. Our result generalizes the previous result of Alexander [1], where only the special case of the longest common subsequence was considered.

Keywords. Random sequence comparison, longest common subsequence, rate of convergence.
1 Introduction

Throughout this paper $X_1, X_2, \ldots$ and $Y_1, Y_2, \ldots$ are two independent sequences of i.i.d. random variables drawn from a finite alphabet $\mathcal{A}$ and having the same distribution. Since we mostly study finite strings of length $n$, let $X = (X_1, X_2, \ldots, X_n)$ and let $Y = (Y_1, Y_2, \ldots, Y_n)$ be the corresponding $n$-dimensional random vectors. We shall usually refer to $X$ and $Y$ as random sequences.
The problem of measuring the similarity of $X$ and $Y$ is central in many areas of application, including computational molecular biology [23] and computational linguistics [18, 20]. In this paper, we consider a general scoring scheme, where $S : \mathcal{A} \times \mathcal{A} \to \mathbb{R}^+$ is a pairwise scoring function that assigns a score to each pair of letters from $\mathcal{A}$. We assume $S$ to be symmetric and we denote by $F$ and $A$ the largest possible score and the largest possible change of the score caused by changing one variable, respectively. Formally (recall that $S$ is symmetric)

\[ F := \max_{a,b \in \mathcal{A}} S(a,b), \qquad A := \max_{a,b,c \in \mathcal{A}} |S(a,b) - S(a,c)|. \]

An alignment is a pair $(\pi, \mu)$ where $\pi = (\pi_1, \pi_2, \ldots, \pi_k)$ and $\mu = (\mu_1, \ldots, \mu_k)$ are two increasing sequences of natural numbers, i.e. $1 \le \pi_1 < \pi_2 < \cdots < \pi_k \le n$ and $1 \le \mu_1 < \mu_2 < \cdots < \mu_k \le n$. The integer $k$ is the number of aligned letters and $n - k$ is the number of gaps in the alignment. Note that our definition of a gap differs slightly from the one commonly used in the sequence alignment literature, where a gap consists of a maximal run of consecutive indels (insertions and deletions) on one side. Our gap actually corresponds to a pair of indels, one on the $X$-side and another on the $Y$-side. Since we consider sequences of equal length, to every indel on the $X$-side corresponds an indel on the $Y$-side, so considering them pairwise is justified. In other words, the number of gaps in our sense is the number of indels in one sequence. We also consider a gap price $\delta$. Given the pairwise scoring function $S$ and the gap price $\delta$, the score of the alignment $(\pi, \mu)$ when aligning $X$ and $Y$ is defined by

\[ U_{(\pi,\mu)}(X,Y) := \sum_{i=1}^{k} S(X_{\pi_i}, Y_{\mu_i}) + \delta (n - k). \]

In our general scoring scheme $\delta$ can also be positive, although usually $\delta \le 0$, penalizing the mismatch (in this case $-\delta$ is usually called the gap penalty). We naturally assume $\delta \le F$.

The (optimal) score of $X$ and $Y$ is defined to be the best score over all possible alignments, i.e.

\[ L_n := L(X;Y) := \max_{(\pi,\mu)} U_{(\pi,\mu)}(X,Y). \]

The alignments achieving the maximum are called optimal. Such a similarity criterion is most commonly used in sequence comparison [3, 25, 26]. When $S(a,b) = 1$ for $a = b$ and $S(a,b) = 0$ for $a \ne b$, then for $\delta = 0$ the optimal score is equal to the length of the longest common subsequence (LCS) of $X$ and $Y$.
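As an illustration of the definition, the optimal score of two equal-length sequences can be computed by a standard dynamic programme. The sketch below is not taken from the paper: the function names `optimal_score` and `lcs_score` are hypothetical, and charging $\delta/2$ per unaligned letter is our own reformulation, valid because for equal lengths each gap in the paper's sense consists of exactly two unaligned letters.

```python
def optimal_score(x, y, score, delta=0.0):
    """Optimal alignment score of two equal-length sequences.

    Assumption (ours): each unaligned letter costs delta / 2, so that a gap in
    the paper's sense (one indel on each side) contributes exactly delta.
    """
    if len(x) != len(y):
        raise ValueError("the score is defined for sequences of equal length")
    n = len(x)
    half = delta / 2.0
    # prev[j] = best score of aligning the empty prefix of x with y[:j]
    prev = [j * half for j in range(n + 1)]
    for i in range(1, n + 1):
        cur = [i * half] + [0.0] * n
        for j in range(1, n + 1):
            cur[j] = max(
                prev[j - 1] + score(x[i - 1], y[j - 1]),  # align x_i with y_j
                prev[j] + half,                           # leave x_i unaligned
                cur[j - 1] + half,                        # leave y_j unaligned
            )
        prev = cur
    return prev[n]


def lcs_score(a, b):
    """Pairwise scoring function of the LCS case: 1 for a match, 0 otherwise."""
    return 1.0 if a == b else 0.0
```

With `lcs_score` and `delta=0.0` the function returns the LCS length, for instance 3 for the strings `"ABCDE"` and `"ACEBD"`.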
It is well known that the sequence $EL_n$, $n = 1, 2, \ldots$ is superadditive, i.e. $EL_{n+m} \ge EL_n + EL_m$ for all $n, m \ge 1$. Hence, by Fekete's lemma the ratios $l_n := EL_n/n$ are nondecreasing and converge to the limit

\[ l := \lim_{n \to \infty} l_n = \sup_n l_n. \]

In fact, from Kingman's subadditive ergodic theorem, it follows that $l$ is also the a.s. limit of $L_n/n$. The limit $l$ (which in the LCS case is called the Chvátal-Sankoff constant) is not known exactly even for the simplest scoring scheme and the simplest model for $X$ and $Y$, so it is usually estimated by simulations. Using McDiarmid's inequality (see (3.6)) one can estimate $l_n$ with prescribed accuracy; to obtain confidence intervals for $l$, the difference $l - l_n$ should be estimated. This is the aim of the present paper.

To the best of our knowledge, the difference $l - l_n$ has been studied theoretically only by Alexander in [1], though there exist many numerical results on the value of $l_n$ or its distribution in various contexts [13, 22]. Alexander proved that in the case of the LCS, for any $C > 2 + \sqrt{2}$ there exists an integer $n_0(C)$ such that

\[ l - l_n \le C \sqrt{\frac{\log n}{n}} \tag{1.1} \]

provided $n \ge n_0(C)$. The bound (1.1) is independent of the common law of $X$ and $Y$, and the integer $n_0(C)$ can be exactly determined. Hence the bound (1.1) can be used for the calculation of explicit confidence intervals.



Our main result is the following:

Theorem 1.1 Let $n \in \mathbb{N}$ be even. Then, for any $c > \sqrt{2} A$,

\[ l - l_n \le c \sqrt{\frac{1}{n-1}\left(\frac{n+1}{n-1} + \ln(n-1)\right)} + \frac{F}{n-1}. \tag{1.2} \]

Note that by the monotonicity of $l_n$, the assumption that $n$ is even is not restrictive. In fact, Alexander's main result (Prop. 2.4 in [1]) is also proven for $n$ even. Theorem 1.1 and its proof generalize Alexander's result in many ways:

1. Theorem 1.1 applies to a general scoring scheme, not just to the LCS. This is due to the fact that our proof is based solely on McDiarmid's large deviation inequality, whilst Alexander's proof, although it also uses McDiarmid's inequality, is mainly based on first passage percolation techniques. Although the percolation approach applies in many situations other than sequence comparison (see [2]), it is not clear whether it can be efficiently applied to our general scoring scheme. For McDiarmid's inequality, however, it makes no difference what kind of scoring is used. This gives us reason to believe that our proof is somewhat "easier" than the one in [1].

2. The proof of Theorem 1.1 relates the rate of convergence of $l_n$ to the cardinality of the set of partitions $\mathcal{B}_{k,n}$ (see Lemma 3.1), so that finding a good rate boils down to finding a good estimate of $|\mathcal{B}_{k,n}|$. The bound (1.2) corresponds to a particular estimate of $|\mathcal{B}_{k,n}|$; any better estimate would give a sharper bound and, probably, also a faster rate. In a sense, the cardinality $|\mathcal{B}_{k,n}|$ could be interpreted as the complexity of the model, and the relation between the rate of convergence and the complexity of the model is a well-known fact in statistics (see e.g. [5]).

3. When applied to the LCS, our bound (1.2) is sharper than (1.1). Indeed, for the case of the LCS the constants $A$ and $F$ in (1.2) can be taken equal to one, and the smaller constants make the difference. In other words, for the case of the LCS both results yield the rate $C\sqrt{\ln n / n}$, but the constant $C$ is different ($C > 3.42$ in Alexander's result and $c > \sqrt{2}$ in ours).

We can easily compare (1.2) and (1.1) by comparing the decay of the following two functions:

\[ R : \{1, \ldots, 10000\} \to \mathbb{R}, \qquad R(n) := (3.42 + 0.1) \sqrt{\frac{\ln n}{n}}, \tag{1.3} \]

\[ Q_F : \{1, \ldots, 10000\} \times \{0.1, \ldots, 2\} \to \mathbb{R}, \qquad Q_F(n, A) := A \sqrt{\frac{2}{n-1}\left(\frac{n+1}{n-1} + \ln(n-1)\right)} + \frac{F}{n-1}. \tag{1.4} \]

In Figure 1 we can see the improvement of the bound (1.2), given by the function (1.4) (changes of $A$ are represented in colours, $F = 1$), over the bound (1.1) by Alexander, given by the function (1.3) (in black). Note that the dark blue curve corresponds to $A = 0.1$ whilst the light violet curve corresponds to $A = 2$ (namely, the colour gets lighter as $A$ increases). The curve in green corresponds to the case $A = 1$ (i.e., our bound for the LCS case).

[Figure 1: Numeric comparison of the bounds (1.2) and (1.1) through the decay functions (1.4) and (1.3), respectively.]
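The two decay functions are easy to evaluate numerically. The following minimal sketch assumes the forms $R(n) = 3.52\sqrt{\ln n / n}$ and $Q_F(n,A) = A\sqrt{\tfrac{2}{n-1}\big(\tfrac{n+1}{n-1} + \ln(n-1)\big)} + \tfrac{F}{n-1}$; the function names `alexander_decay` and `q_bound` are ours and purely illustrative.

```python
import math


def alexander_decay(n):
    """Decay function R(n) of (1.3), with the constant 3.42 + 0.1."""
    return (3.42 + 0.1) * math.sqrt(math.log(n) / n)


def q_bound(n, A, F=1.0):
    """Decay function Q_F(n, A) of (1.4); requires n >= 2."""
    inner = (n + 1) / (n - 1) + math.log(n - 1)
    return A * math.sqrt(2.0 / (n - 1) * inner) + F / (n - 1)


if __name__ == "__main__":
    # Compare the two bounds in the LCS case (A = F = 1) for a few even n.
    for n in (10, 100, 1000, 10000):
        print(n, round(alexander_decay(n), 4), round(q_bound(n, 1.0), 4))
```

In the LCS case the second function is already smaller at $n = 4$, and the ratio of the two approaches $\sqrt{2}/3.52$ as $n$ grows.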
2 Confidence bounds for $l$

Suppose that $k$ samples $X^i = (X^i_1, \ldots, X^i_n)$ and $Y^i = (Y^i_1, \ldots, Y^i_n)$, $i = 1, \ldots, k$, are generated. Let $L^i_n$ be the score of the $i$-th sample. Thus $EL^i_n = n l_n$. By McDiarmid's inequality (see (3.5) below), for every $\rho > 0$

\[ P\left(\frac{1}{kn}\sum_{i=1}^{k} L^i_n - l_n < -\rho\right) = P\left(\sum_{i=1}^{k} L^i_n - kn\, l_n < -kn\rho\right) \le \exp\left[-\frac{\rho^2 kn}{A^2}\right]. \tag{2.1} \]

Let

\[ \bar{L}_n := \frac{1}{kn}\sum_{i=1}^{k} L^i_n. \]

If $n$ is even, by (1.2) and (1.4) we have that $l \le l_n + Q_F(n, A)$ and then

\[ P\big(\bar{L}_n + \rho + Q_F(n,A) \ge l\big) \ge P\big(\bar{L}_n + \rho \ge l_n\big) = P\big(\bar{L}_n - l_n \ge -\rho\big) \ge 1 - \exp\left[-\frac{\rho^2 kn}{A^2}\right]. \tag{2.2} \]

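Now, given $\varepsilon > 0$, choose $\rho = \rho(n, \varepsilon)$ so that the right-hand side of the last inequality is equal to $1 - \varepsilon$:

\[ \rho(n, \varepsilon) = A \sqrt{\frac{\ln(1/\varepsilon)}{kn}}. \]

So, with probability at least $1 - \varepsilon$, we obtain a one-sided confidence interval as follows:

\[ l \le \bar{L}_n + Q_F(n, A) + A \sqrt{\frac{\ln(1/\varepsilon)}{kn}}. \tag{2.3} \]

In statistical learning, inequalities of type (2.3) are known as PAC (probably approximately correct) inequalities. The two-sided confidence bounds are, with probability at least $1 - \varepsilon$, as follows:

\[ \bar{L}_n - A \sqrt{\frac{\ln(2/\varepsilon)}{kn}} \le l \le \bar{L}_n + Q_F(n, A) + A \sqrt{\frac{\ln(2/\varepsilon)}{kn}}. \tag{2.4} \]

The bounds in (2.4) suggest using the estimate

\[ \hat{l}_n := \bar{L}_n + \frac{1}{2} Q_F(n, A) \]

so that the confidence bounds for this estimate are

\[ P\left(|\hat{l}_n - l| \le \frac{1}{2} Q_F(n, A) + A \sqrt{\frac{\ln(2/\varepsilon)}{kn}}\right) \ge 1 - \varepsilon. \tag{2.5} \]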
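The half-width of the two-sided interval is a closed-form expression, so it can be checked with a few lines of code. The sketch below assumes the half-width $\tfrac12 Q_F(n,A) + A\sqrt{\ln(2/\varepsilon)/(kn)}$ with $Q_F$ as in (1.4); the names `q_bound` and `half_width` are hypothetical.

```python
import math


def q_bound(n, A, F):
    """Q_F(n, A) as in (1.4): A * sqrt(2/(n-1) * ((n+1)/(n-1) + ln(n-1))) + F/(n-1)."""
    inner = (n + 1) / (n - 1) + math.log(n - 1)
    return A * math.sqrt(2.0 / (n - 1) * inner) + F / (n - 1)


def half_width(n, k, A, F, eps):
    """Half-width of the confidence interval around the shifted estimate."""
    bias_part = 0.5 * q_bound(n, A, F)                        # half of the bound on l - l_n
    noise_part = A * math.sqrt(math.log(2.0 / eps) / (k * n))  # McDiarmid deviation term
    return bias_part + noise_part


if __name__ == "__main__":
    # LCS case with k = 2 samples of length n = 100000 at level 0.95
    print(round(half_width(100000, 2, 1.0, 1.0, 0.05), 4))
```

For $n = 100000$, $k = 2$, $A = F = 1$ and $\varepsilon = 0.05$ this evaluates to about 0.0122.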



Alexander [1] obtained, for $n = 100000$, $k = 2$ and $A = F = 1$ (the LCS case), the following bounds:

\[ P(|\hat{l}_n - l| \le 0.0264) \ge 0.95. \tag{2.6} \]

By using (2.5) and (1.4) we obtain, for $n = 100000$, $k = 2$ and $A = F = 1$ (the LCS case), the following bounds:

\[ P(|\hat{l}_n - l| \le 0.0122) \ge 0.95. \tag{2.7} \]

It is clear that (2.7) is sharper than (2.6). To the best of our knowledge, the best previous lower and upper bounds for $l$, in the LCS context for $\mathcal{A} = \{0, 1\}$, were due to Dančík, Dančík and Paterson [22] (0.773911 and 0.837623, respectively) and Lueker (0.788071 and 0.826280, respectively).



Remark: The inequality (2.3) confirms the well-known fact that it is better to generate one sample of length $kn$ rather than $k$ samples of length $n$. Indeed, with one sample of length $kn$, the inequality (2.3) becomes

\[ l \le \frac{L_{kn}}{kn} + Q_F(kn, A) + A \sqrt{\frac{\ln(1/\varepsilon)}{kn}} \]

and since $Q_F(kn, A) < Q_F(n, A)$, the bounds get narrower.



3 Proof of the main result

3.1 The set of partitions $\mathcal{B}_{k,n}$

In this section, we shall consider the sequences $X$ and $Y$ with length $kn$, where $k, n$ are nonnegative integers. Let $(\pi, \mu)$ be an arbitrary alignment of $X$ and $Y$. Let $\nu = (\nu_1, \ldots, \nu_{r+1})$ and $\tau = (\tau_1, \ldots, \tau_{r+1})$ be vectors satisfying

\[ 1 = \nu_1 \le \nu_2 \le \cdots \le \nu_r \le \nu_{r+1} = kn + 1, \qquad 1 = \tau_1 \le \tau_2 \le \cdots \le \tau_r \le \tau_{r+1} = kn + 1. \tag{3.1} \]

We say that the pair $(\nu, \tau)$ forms an $r$-partition of the alignment $(\pi, \mu)$ if for any $j = 1, \ldots, r$, the following conditions are simultaneously satisfied:

1) if, for some $i = 1, \ldots, k$, it holds that $\nu_j \le \pi_i < \nu_{j+1}$, then $\tau_j \le \mu_i < \tau_{j+1}$;

2) if, for some $i = 1, \ldots, k$, it holds that $\tau_j \le \mu_i < \tau_{j+1}$, then $\nu_j \le \pi_i < \nu_{j+1}$.

Thus $(\nu, \tau)$ is an $r$-partition if the sequences $X$ and $Y$ can be partitioned into $r$ pieces

\[ (X_1, \ldots, X_{\nu_2 - 1}),\ (X_{\nu_2}, \ldots, X_{\nu_3 - 1}),\ \ldots,\ (X_{\nu_r}, \ldots, X_{kn}) \]
\[ (Y_1, \ldots, Y_{\tau_2 - 1}),\ (Y_{\tau_2}, \ldots, Y_{\tau_3 - 1}),\ \ldots,\ (Y_{\tau_r}, \ldots, Y_{kn}) \]

such that the alignment $(\pi, \mu)$ aligns the piece $(X_{\nu_j}, \ldots, X_{\nu_{j+1} - 1})$ with the piece $(Y_{\tau_j}, \ldots, Y_{\tau_{j+1} - 1})$, where $j = 1, \ldots, r$. It is important to note that the pieces might be empty, i.e. it might be that $\nu_j = \nu_{j+1}$ (or $\tau_j = \tau_{j+1}$), meaning that $(\tau_j, \ldots, \tau_{j+1} - 1)$ cannot contain any elements of $\mu$, otherwise requirement 2) would be violated (or $(\nu_j, \ldots, \nu_{j+1} - 1)$ cannot contain any elements of $\pi$, otherwise requirement 1) would be violated). Hence, if for a partition a piece of $X$ is empty, then the corresponding piece of $Y$ cannot have any aligned letter.






The following observation shows that any alignment of $X$ and $Y$ can be partitioned into $r$ pieces such that $k \le r \le \left\lceil \frac{2kn}{2n-1} \right\rceil$ and such that the sum of the lengths of the two aligned pieces in each part is always at most $2n$. We believe that the idea of the proof as well as the meaning of the partition becomes transparent through an example.

Example. Let $n = 3$, $k = 4$. Let $\pi = (1, 5, 6, 9, 10, 12)$ and $\mu = (2, 3, 4, 6, 9, 10)$. The alignment $(\pi, \mu)$ can be represented as follows (a dash marks a gap):

    X |  -  1  2  3  4  5  6  7  8  -  9  -  - 10 11 12  -  -
    Y |  1  2  -  -  -  3  4  -  -  5  6  7  8  9  - 10 11 12
The table above indicates that $X_1$ is aligned with $Y_2$, $X_5$ is aligned with $Y_3$ and so on; the rest of the letters are unaligned, so we say that they are aligned with gaps. In the table, there are two types of columns: the columns with two figures (aligned pairs) and the columns with one figure (unaligned letters). Let $u_i \in \{1, 2\}$ be the number of figures in the $i$-th column, and let $s_j = u_1 + \cdots + u_j$ be the corresponding cumulative sum. To get an $r$-partition proceed as follows: start from the beginning of the table (leftmost position) and find $j$ such that $s_j = 2n$. Since the cumulative sum increases by one or two, such a $j$ might not exist. In this case find $j$ such that $s_j = 2n - 1$. In the present example $n = 3$, thus we are looking for $j$ such that $s_j = 6$. Such a $j$ is 5. The first five columns thus form the first part of the partition and there are exactly $2n = 6$ elements in the first part (those elements are $X_1, X_2, X_3, X_4, Y_1$ and $Y_2$). Now disregard the first five columns of the table and start the same procedure afresh. Then the second part is obtained, and so on. In the following table the vertical lines indicate the different parts obtained by the aforementioned procedure: the first, second and fourth parts have six elements, the third part has five elements and the last part consists of one element:

    X |  -  1  2  3  4 |  5  6  7  8 |  -  9  -  - | 10 11 12  - |  -
    Y |  1  2  -  -  - |  3  4  -  - |  5  6  7  8 |  9  - 10 11 | 12
From the table, we read off the corresponding pieces on the $X$-side: $(1, 4), (5, 8), (9, 9), (10, 12)$, followed by an empty piece, as well as the ones on the $Y$-side: $(1, 2), (3, 4), (5, 8), (9, 11), (12, 12)$. The corresponding vectors $\nu$ and $\tau$ are thus $\nu = (1, 5, 9, 10, 13, 13)$, $\tau = (1, 3, 5, 9, 12, 13)$. The number of parts in such a partition is clearly at least $k$ (corresponding to the case that all parts sum up to $2n$) and at most $\left\lceil \frac{2kn}{2n-1} \right\rceil$ (corresponding to the case that all parts except the last one sum up to $2n - 1$). In our example $r = 5 = \lceil 24/5 \rceil$. Now, it is clear that the following claim holds.
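The cutting procedure of the example can be sketched in code. The snippet below is only an illustration under our own conventions: within each block between two aligned pairs the unaligned $X$-letters are listed before the unaligned $Y$-letters (as in the table of the example), and the names `columns` and `greedy_partition` are hypothetical.

```python
def columns(pi, mu, length):
    """List the columns of the alignment table as (x_index, y_index) pairs.

    None marks a gap. Unaligned X-letters are placed before unaligned
    Y-letters inside each block (a convention we assume here).
    """
    cols = []
    next_x, next_y = 1, 1
    for p, m in zip(pi, mu):
        cols += [(i, None) for i in range(next_x, p)]
        cols += [(None, j) for j in range(next_y, m)]
        cols.append((p, m))
        next_x, next_y = p + 1, m + 1
    cols += [(i, None) for i in range(next_x, length + 1)]
    cols += [(None, j) for j in range(next_y, length + 1)]
    return cols


def greedy_partition(pi, mu, n, k):
    """Cut the table from the left into parts holding 2n (or 2n - 1) letters."""
    length = k * n
    nu, tau = [1], [1]
    next_x, next_y = 1, 1  # first index not yet assigned to a part
    size = 0               # number of letters in the current part
    for x, y in columns(pi, mu, length):
        weight = (x is not None) + (y is not None)
        if size + weight > 2 * n:  # current part is full: close it
            nu.append(next_x)
            tau.append(next_y)
            size = 0
        if x is not None:
            next_x = x + 1
        if y is not None:
            next_y = y + 1
        size += weight
    nu.append(length + 1)  # close the last part
    tau.append(length + 1)
    return nu, tau
```

For the alignment of the example, `greedy_partition((1, 5, 6, 9, 10, 12), (2, 3, 4, 6, 9, 10), n=3, k=4)` returns the vectors `[1, 5, 9, 10, 13, 13]` and `[1, 3, 5, 9, 12, 13]`.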

Claim 3.1 Let $X, Y$ be sequences of length $kn$ and let $(\pi, \mu)$ be an arbitrary alignment of $X$ and $Y$. Then there exist an integer $r$ such that $k \le r \le \left\lceil \frac{2kn}{2n-1} \right\rceil$ and an $r$-partition $(\nu, \tau)$ of $(\pi, \mu)$ such that for every $j = 1, \ldots, r - 1$, it holds

\[ (\nu_{j+1} - \nu_j) + (\tau_{j+1} - \tau_j) \in \{2n, 2n - 1\} \quad \text{and} \quad (\nu_{r+1} - \nu_r) + (\tau_{r+1} - \tau_r) \le 2n. \tag{3.2} \]
Let, for every $r$, $\mathcal{B}^r_{k,n}$ be the set of pairs of vectors $\nu = (\nu_1, \ldots, \nu_{r+1})$ and $\tau = (\tau_1, \ldots, \tau_{r+1})$ satisfying (3.1) and (3.2). Let

\[ \mathcal{B}_{k,n} := \bigcup_{r=k}^{\lceil 2kn/(2n-1) \rceil} \mathcal{B}^r_{k,n}. \]

We shall call the elements of $\mathcal{B}_{k,n}$ partitions. For every partition $(\nu, \tau) \in \mathcal{B}_{k,n}$, we define

\[ L_{kn}(\nu, \tau) := \sum_{j=1}^{r} L(X_{\nu_j}, \ldots, X_{\nu_{j+1} - 1}; Y_{\tau_j}, \ldots, Y_{\tau_{j+1} - 1}), \]

where $L(X_{\nu_j}, \ldots, X_{\nu_{j+1} - 1}; Y_{\tau_j}, \ldots, Y_{\tau_{j+1} - 1})$ is the optimal score between $X_{\nu_j}, \ldots, X_{\nu_{j+1} - 1}$ and $Y_{\tau_j}, \ldots, Y_{\tau_{j+1} - 1}$. The key observation is the following: if $(\pi, \mu)$ is optimal for $X, Y$ and $(\nu, \tau)$ is an $r$-partition of $(\pi, \mu)$, then $L_{kn} = L_{kn}(\nu, \tau)$. By Claim 3.1 every alignment, including the optimal one, has at least one partition from the set $\mathcal{B}_{k,n}$, hence it follows that

\[ L_{kn} = \max_{(\nu, \tau) \in \mathcal{B}_{k,n}} L_{kn}(\nu, \tau). \tag{3.3} \]



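Claim 3.2 For every $r$-partition $(\nu, \tau) \in \mathcal{B}^r_{k,n}$,

\[ E(L_{kn}(\nu, \tau)) \le \frac{r}{2} EL_{2n} \le \frac{1}{2} \left\lceil \frac{2kn}{2n-1} \right\rceil EL_{2n}. \tag{3.4} \]

Proof. Let $(\nu, \tau) \in \mathcal{B}^r_{k,n}$ with $r \le \left\lceil \frac{2kn}{2n-1} \right\rceil$. Let $j$ be such that $(\nu_{j+1} - \nu_j) + (\tau_{j+1} - \tau_j) = 2n$. Thus, there exists an integer $u \in \{-n, \ldots, n\}$ such that $\nu_{j+1} - \nu_j = n - u$ and $\tau_{j+1} - \tau_j = n + u$. Since $X_1, X_2, \ldots, Y_1, Y_2, \ldots$ are i.i.d., we have

\[ \begin{aligned} E(L(X_{\nu_j}, \ldots, X_{\nu_{j+1} - 1}; Y_{\tau_j}, \ldots, Y_{\tau_{j+1} - 1})) &= E(L(X_1, \ldots, X_{n-u}; Y_1, \ldots, Y_{n+u})) \\ &= E(L(X_{n-u+1}, \ldots, X_{2n}; Y_{n+u+1}, \ldots, Y_{2n})) \\ &\le \frac{1}{2} E(L(X_1, \ldots, X_{2n}; Y_1, \ldots, Y_{2n})) = \frac{1}{2} EL_{2n}. \end{aligned} \]

The second equality uses the symmetry of $S$ and the fact that $X$ and $Y$ have the same law. The last inequality follows from the superadditivity:

\[ L(X_1, \ldots, X_{n-u}; Y_1, \ldots, Y_{n+u}) + L(X_{n-u+1}, \ldots, X_{2n}; Y_{n+u+1}, \ldots, Y_{2n}) \le L(X_1, \ldots, X_{2n}; Y_1, \ldots, Y_{2n}). \]

If $(\nu_{j+1} - \nu_j) + (\tau_{j+1} - \tau_j) < 2n$, then by the same argument

\[ E(L(X_{\nu_j}, \ldots, X_{\nu_{j+1} - 1}; Y_{\tau_j}, \ldots, Y_{\tau_{j+1} - 1})) \le E(L(X_1, \ldots, X_{n-u}; Y_1, \ldots, Y_{n+u})) \le \frac{1}{2} EL_{2n}. \]

Hence the first inequality in (3.4) follows. The second inequality follows from the condition $r \le \left\lceil \frac{2kn}{2n-1} \right\rceil$. ■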
3.2 The size of $\mathcal{B}_{k,n}$ and the rate of convergence

In the following we prove the main theoretical result that links the rate of convergence to the rate at which $|\mathcal{B}_{k,n}|$ grows as $k$ increases. Our proof is entirely based on McDiarmid's inequality, so let us recall it for the sake of completeness: Let $Z_1, \ldots, Z_{2m}$ be independent random variables and let $f(Z_1, \ldots, Z_{2m})$ be a function such that changing one variable changes the value by at most $A$. Then for any $\Delta > 0$,

\[ P\big(f(Z_1, \ldots, Z_{2m}) - Ef(Z_1, \ldots, Z_{2m}) > \Delta\big) \le \exp\left[-\frac{\Delta^2}{m A^2}\right]. \tag{3.5} \]

For the proof, we refer to [13]. We apply (3.5) with $L$ in the role of $f$ to the independent (but not necessarily identically distributed) random variables $X_1, \ldots, X_m, Y_1, \ldots, Y_m$. It is easy but important to see that, independently of the value of $\delta$, changing one random variable changes the score by at most $A$, so that in our case (3.5) is

\[ P(L_m - EL_m > \Delta) \le \exp\left[-\frac{\Delta^2}{m A^2}\right]. \tag{3.6} \]

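Lemma 3.1 Suppose that for any $n$ and for $k$ big enough

\[ |\mathcal{B}_{k,n}| \le \exp[(\psi(n) + o(k)) kn], \tag{3.7} \]

where $\psi(n)$ does not depend on $k$. Let $u(n) > A \sqrt{\psi(n)}$. Then

\[ l - l_{2n} \le u(n) + \frac{l_{2n}}{2n-1} \le u(n) + \frac{l}{2n-1} \le u(n) + \frac{F}{2n-1}. \tag{3.8} \]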



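Proof. Let $(\nu, \tau) \in \mathcal{B}_{k,n}$. Recall (3.4). Thus, from (3.6), we get that for any $\rho > 0$

\[ P\left(L_{kn}(\nu, \tau) - \frac{1}{2} \left\lceil \frac{2kn}{2n-1} \right\rceil EL_{2n} > \rho kn\right) \le P\big(L_{kn}(\nu, \tau) - E(L_{kn}(\nu, \tau)) > \rho kn\big) \le \exp\left[-\frac{\rho^2 kn}{A^2}\right]. \tag{3.9} \]

From (3.3) and (3.7) it now follows that, for big $k$,

\[ \begin{aligned} P\left(\frac{L_{kn}}{kn} - \frac{1}{k} \left\lceil \frac{2kn}{2n-1} \right\rceil l_{2n} > \rho\right) &\le \sum_{(\nu, \tau) \in \mathcal{B}_{k,n}} P\left(L_{kn}(\nu, \tau) - \frac{1}{2} \left\lceil \frac{2kn}{2n-1} \right\rceil EL_{2n} > \rho kn\right) \\ &\le |\mathcal{B}_{k,n}| \exp\left[-\frac{\rho^2 kn}{A^2}\right] \le \exp\left[\left(\psi(n) + o(k) - \frac{\rho^2}{A^2}\right) kn\right]. \end{aligned} \]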



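We consider $n$ fixed and let $k$ go to infinity. If $u(n) > A \sqrt{\psi(n)}$, then there exists $K(n) < \infty$ so that for every $k > K(n)$,

\[ \psi(n) + o(k) - \left(\frac{u(n)}{A}\right)^2 \le \frac{1}{2} \left(\psi(n) - \left(\frac{u(n)}{A}\right)^2\right). \]

Hence, replacing $\rho$ with $u(n)$ in the inequalities above, we obtain for every $k > K(n)$

\[ P\left(\frac{L_{kn}}{kn} - \frac{1}{k} \left\lceil \frac{2kn}{2n-1} \right\rceil l_{2n} > u(n)\right) \le \exp\left[\frac{1}{2} \left(\psi(n) - \left(\frac{u(n)}{A}\right)^2\right) nk\right] = \exp[-d_n k], \tag{3.10} \]

where

\[ d_n := \frac{1}{2} \left(\left(\frac{u(n)}{A}\right)^2 - \psi(n)\right) n > 0. \]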



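Now recall the assumption that $\delta \le F$. Hence for any $n$ and $k$, the random variable $\frac{L_{kn}}{kn}$ is bounded by $F$. From (3.10), it thus follows that for any $k > K(n)$

\[ l_{kn} = E\left(\frac{L_{kn}}{kn}\right) \le \frac{1}{k} \left\lceil \frac{2kn}{2n-1} \right\rceil l_{2n} + u(n) + F \exp[-d_n k]. \]

Since $l_{kn} \to l$ as $k \to \infty$ and

\[ \frac{1}{k} \left\lceil \frac{2kn}{2n-1} \right\rceil \le \frac{2n}{2n-1} + \frac{1}{k}, \]

we obtain that for any $n$,

\[ l \le \frac{2n}{2n-1} l_{2n} + u(n) = l_{2n} \left(1 + \frac{1}{2n-1}\right) + u(n). \]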



The proof of Theorem 1.1

From Lemma 3.1, it follows that to obtain a bound on $l - l_n$, a suitable estimate of $|\mathcal{B}_{k,n}|$ satisfying (3.7) should be found.



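Let us estimate $|\mathcal{B}^r_{k,n}|$. The number of ways to choose the parts on the $X$-side is bounded above by the number of combinations with repetition from $nk + 1$ by $r - 1$. The repetitions allow empty parts. When the size of a part on the $X$-side is $m$, then, except for the last part, the size of the corresponding part on the $Y$-side has two possibilities: $2n - 1 - m$ or $2n - m$. Hence to any $r$-partition of the $X$-side correspond at most $2^{r-1} \cdot 2n$ options on the $Y$-side. In the following we use the fact that the number of combinations with repetition from $nk + 1$ by $r - 1$ is $\binom{nk + r - 1}{r - 1}$ and that for any non-negative integers $a \ge b$ it holds

\[ \binom{a}{b} \le \exp\left[h_e\left(\frac{b}{a}\right) a\right], \]

where $h_e(q) := -q \ln q - (1 - q) \ln(1 - q)$ is the binary entropy function. Since $r \le \left\lceil \frac{2nk}{2n-1} \right\rceil$ implies that $r - 1 \le \frac{2nk}{2n-1}$, we thus have for $n \ge 2$

\[ \begin{aligned} |\mathcal{B}^r_{k,n}| &\le (2^{r-1} \cdot 2n) \binom{nk + r - 1}{r - 1} \\ &\le \exp\left[(r - 1) \ln 2 + \ln(2n) + h_e\left(\frac{r-1}{nk + r - 1}\right)(nk + r - 1)\right] \\ &\le \exp\left[\left(\frac{\ln 4}{2n-1} + \frac{\ln(2n)}{nk}\right) nk + h_e\left(\frac{r-1}{nk + r - 1}\right)\left(1 + \frac{2}{2n-1}\right) nk\right] \\ &\le \exp\left[\left(\frac{\ln 4}{2n-1} + \frac{\ln(2n)}{nk} + h_e\left(\frac{2}{2n+1}\right) \frac{2n+1}{2n-1}\right) nk\right]. \end{aligned} \]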
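The last inequality follows from the inequalities

\[ \frac{r - 1}{nk + r - 1} \le \frac{\frac{2nk}{2n-1}}{nk + \frac{2nk}{2n-1}} = \frac{2}{2n+1}, \]

so that if $n \ge 2$, then $\frac{2}{2n+1} \le 0.5$ and, since the binary entropy function is increasing on $[0, 0.5]$,

\[ h_e\left(\frac{r - 1}{nk + r - 1}\right) \le h_e\left(\frac{2}{2n+1}\right). \]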

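Hence

\[ \begin{aligned} |\mathcal{B}_{k,n}| &\le \left(\frac{2nk}{2n-1} - k + 2\right) \exp\left[\left(\frac{\ln 4}{2n-1} + \frac{\ln(2n)}{nk} + h_e\left(\frac{2}{2n+1}\right) \frac{2n+1}{2n-1}\right) nk\right] \\ &= \left(\frac{k}{2n-1} + 2\right) \exp\left[\left(\frac{\ln 4}{2n-1} + \frac{\ln(2n)}{nk} + h_e\left(\frac{2}{2n+1}\right) \frac{2n+1}{2n-1}\right) nk\right] \\ &= \exp\left[\left(\frac{\ln\left(\frac{k}{2n-1} + 2\right) + \ln(2n)}{nk} + \frac{\ln 4}{2n-1} + h_e\left(\frac{2}{2n+1}\right) \frac{2n+1}{2n-1}\right) nk\right] \\ &= \exp\left[\left(o(k) + \frac{\ln 4}{2n-1} + h_e\left(\frac{2}{2n+1}\right) \frac{2n+1}{2n-1}\right) nk\right] \\ &\le \exp\left[\left(o(k) + \frac{2}{2n-1}\left(\frac{2n+1}{2n-1} + \ln(2n-1)\right)\right) nk\right], \end{aligned} \]

where $o(k)$ stands for the term $\frac{1}{nk}\left(\ln\left(\frac{k}{2n-1} + 2\right) + \ln(2n)\right)$, which vanishes as $k \to \infty$, and where the last inequality follows from the inequality

\[ \frac{\ln 4}{2n-1} + \frac{2n+1}{2n-1} h_e\left(\frac{2}{2n+1}\right) \le \frac{2}{2n-1}\left(\frac{2n+1}{2n-1} + \ln(2n-1)\right). \tag{3.11} \]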



Hence (3.7) holds with

\[ \psi(n) = \frac{2}{2n-1}\left(\frac{2n+1}{2n-1} + \ln(2n-1)\right). \]

The inequality (1.2) now follows from Lemma 3.1.
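The two analytic ingredients of this estimate, the entropy bound on binomial coefficients and the inequality (3.11), can be checked numerically. The following sketch assumes $\psi(n) = \tfrac{2}{2n-1}\big(\tfrac{2n+1}{2n-1} + \ln(2n-1)\big)$ and reads the left-hand side of (3.11) as $\tfrac{\ln 4}{2n-1} + \tfrac{2n+1}{2n-1} h_e\big(\tfrac{2}{2n+1}\big)$; all function names are hypothetical.

```python
import math


def h_e(q):
    """Binary entropy with natural logarithm, h_e(0) = h_e(1) = 0."""
    if q in (0.0, 1.0):
        return 0.0
    return -q * math.log(q) - (1.0 - q) * math.log(1.0 - q)


def psi(n):
    """Exponent psi(n) for which the size bound (3.7) is claimed to hold."""
    return 2.0 / (2 * n - 1) * ((2 * n + 1) / (2 * n - 1) + math.log(2 * n - 1))


def entropy_side(n):
    """Left-hand side of (3.11), as we read it."""
    return math.log(4) / (2 * n - 1) + (2 * n + 1) / (2 * n - 1) * h_e(2.0 / (2 * n + 1))


def limiting_bound(n_even, A, F):
    """A * sqrt(psi(n/2)) + F / (n - 1): the limiting case u = A * sqrt(psi) in Lemma 3.1."""
    m = n_even // 2
    return A * math.sqrt(psi(m)) + F / (2 * m - 1)


if __name__ == "__main__":
    print(all(entropy_side(n) <= psi(n) for n in range(2, 5001)))
    print(round(limiting_bound(100000, 1.0, 1.0), 5))
```

A numerical check of this kind does not replace the proof, but it is a cheap way to catch a misread constant when transcribing the formulas.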